
    Distributionally Robust Classification on a Data Budget

    Real-world uses of deep learning require predictable model behavior under distribution shifts. Models such as CLIP show emergent natural distributional robustness comparable to humans, but may require hundreds of millions of training samples. Can we train robust learners in a domain where data is limited? To rigorously address this question, we introduce JANuS (Joint Annotations and Names Set), a collection of four new training datasets with images, labels, and corresponding captions, and perform a series of carefully controlled investigations of factors contributing to robustness in image classification; we then compare those results to findings derived from a large-scale meta-analysis. Using this approach, we show that a standard ResNet-50 trained with the cross-entropy loss on 2.4 million image samples can attain comparable robustness to a CLIP ResNet-50 trained on 400 million samples. To our knowledge, this is the first result showing (near) state-of-the-art distributional robustness on limited data budgets. Our dataset is available at https://huggingface.co/datasets/penfever/JANuS_dataset, and the code used to reproduce our experiments can be found at https://github.com/penfever/vlhub/. Comment: TMLR 2023; openreview link: https://openreview.net/forum?id=D5Z2E8CNs
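    The dataset link above points to a standard Hugging Face Hub repository, so a minimal sketch of pulling it down with the datasets library might look like the following; the split name and the assumption that the default configuration loads directly are illustrative guesses rather than details from the paper, so consult the dataset card for the actual schema.

        # Minimal sketch (assumptions noted above): load JANuS from the Hugging Face Hub
        # and peek at a few records. Column names are whatever the dataset card defines.
        from datasets import load_dataset

        janus = load_dataset("penfever/JANuS_dataset", split="train")  # split name assumed

        for example in janus.select(range(3)):
            # Each record is expected to pair an image with a label and a caption.
            print(example.keys())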

    When Do Neural Nets Outperform Boosted Trees on Tabular Data?

    Tabular data is one of the most commonly used types of data in machine learning. Despite recent advances in neural nets (NNs) for tabular data, there is still an active discussion on whether or not NNs generally outperform gradient-boosted decision trees (GBDTs) on tabular data, with several recent works arguing either that GBDTs consistently outperform NNs on tabular data, or vice versa. In this work, we take a step back and question the importance of this debate. To this end, we conduct the largest tabular data analysis to date, comparing 19 algorithms across 176 datasets, and we find that the 'NN vs. GBDT' debate is overemphasized: for a surprisingly high number of datasets, either the performance difference between GBDTs and NNs is negligible, or light hyperparameter tuning on a GBDT is more important than choosing between NNs and GBDTs. A remarkable exception is the recently proposed prior-data fitted network, TabPFN: although it is effectively limited to training sets of size 3000, we find that it outperforms all other algorithms on average, even when randomly sampling 3000 training datapoints. Next, we analyze dozens of metafeatures to determine what properties of a dataset make NNs or GBDTs better suited to perform well. For example, we find that GBDTs are much better than NNs at handling skewed or heavy-tailed feature distributions and other forms of dataset irregularities. Our insights act as a guide for practitioners to determine which techniques may work best on their dataset. Finally, with the goal of accelerating tabular data research, we release the TabZilla Benchmark Suite: a collection of the 36 'hardest' of the datasets we study. Our benchmark suite, codebase, and all raw results are available at https://github.com/naszilla/tabzilla. Comment: NeurIPS Datasets and Benchmarks Track 202
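    To make the 'light hyperparameter tuning on a GBDT' point concrete, here is a minimal sketch comparing a default GBDT against one tuned with a small random search; the dataset, search space, and tuning budget are illustrative assumptions and are not drawn from the TabZilla protocol.

        # Minimal sketch: default GBDT vs. lightly tuned GBDT on a small tabular task.
        # Dataset, search space, and tuning budget are illustrative assumptions only.
        from sklearn.datasets import load_breast_cancer
        from sklearn.ensemble import GradientBoostingClassifier
        from sklearn.model_selection import RandomizedSearchCV, train_test_split

        X, y = load_breast_cancer(return_X_y=True)
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

        default_gbdt = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)

        search = RandomizedSearchCV(
            GradientBoostingClassifier(random_state=0),
            param_distributions={
                "learning_rate": [0.03, 0.1, 0.3],
                "max_depth": [2, 3, 5],
                "n_estimators": [100, 300],
            },
            n_iter=8,  # "light" tuning: only a handful of configurations
            cv=3,
            random_state=0,
        ).fit(X_tr, y_tr)

        print("default GBDT accuracy:", default_gbdt.score(X_te, y_te))
        print("tuned GBDT accuracy:  ", search.score(X_te, y_te))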

    Comparative economic evaluation of data from the ACRIN national CT colonography trial with three cancer intervention and surveillance modeling network microsimulations

    Purpose: To estimate the cost-effectiveness of computed tomographic (CT) colonography for colorectal cancer (CRC) screening in average-risk asymptomatic subjects in the United States aged 50 years. Materials and Methods: Enrollees in the American College of Radiology Imaging Network National CT Colonography Trial provided informed consent, and approval was obtained from the institutional review board at each site. CT colonography performance estimates from the trial were incorporated into three Cancer Intervention and Surveillance Modeling Network CRC microsimulations. Simulated survival and lifetime costs for screening 50-year-old subjects in the United States with CT colonography every 5 or 10 years were compared with those for guideline-concordant screening with colonoscopy, flexible sigmoidoscopy plus either sensitive unrehydrated fecal occult blood testing (FOBT) or fecal immunochemical testing (FIT), and no screening. Perfect and reduced screening adherence scenarios were considered. Incremental cost-effectiveness and net health benefits were estimated from the U.S. health care sector perspective, assuming a 3% discount rate. Results: CT colonography at 5- and 10-year screening intervals was more costly and less effective than FOBT plus flexible sigmoidoscopy in all three models in both 100% and 50% adherence scenarios. Colonoscopy also was more costly and less effective than FOBT plus flexible sigmoidoscopy, except in the CRC-SPIN model assuming 100% adherence (incremental cost-effectiveness ratio: $26 300 per life-year gained). CT colonography at 5- and 10-year screening intervals and colonoscopy were net beneficial compared with no screening in all model scenarios. The 5-year screening interval was net beneficial over the 10-year interval except in the MISCAN model when assuming 100% adherence and willingness to pay $50 000 per life-year gained. Conclusion: All three models predict CT colonography to be more costly and less effective than non-CT colonographic screening but net beneficial compared with no screening given model assumptions.
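    For readers unfamiliar with the two quantities the models report, the sketch below shows the incremental cost-effectiveness ratio (ICER) and 3% discounting in their simplest form; all inputs are placeholders, not values from the trial or the microsimulations.

        # Minimal sketch of the two calculations named in the abstract: discounting
        # future costs or life-years at 3% per year, and the ICER of one screening
        # strategy relative to a comparator. All numbers are placeholders.

        def present_value(amount, years_from_now, rate=0.03):
            """Discount a future cost or life-year back to present value."""
            return amount / (1.0 + rate) ** years_from_now

        def icer(cost_a, effect_a, cost_b, effect_b):
            """Incremental cost per unit of effect (e.g., per life-year gained)
            of strategy A relative to comparator B."""
            return (cost_a - cost_b) / (effect_a - effect_b)

        # Placeholder example: strategy A costs more but also yields more life-years.
        print(present_value(1000.0, years_from_now=10))
        print(icer(cost_a=2500.0, effect_a=0.15, cost_b=1800.0, effect_b=0.12))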

    Avant-garde and experimental music


    Guidelines for the use and interpretation of assays for monitoring autophagy (3rd edition)

    In 2008 we published the first set of guidelines for standardizing research in autophagy. Since then, research on this topic has continued to accelerate, and many new scientists have entered the field. Our knowledge base and relevant new technologies have also been expanding. Accordingly, it is important to update these guidelines for monitoring autophagy in different organisms. Various reviews have described the range of assays that have been used for this purpose. Nevertheless, there continues to be confusion regarding acceptable methods to measure autophagy, especially in multicellular eukaryotes. For example, a key point that needs to be emphasized is that there is a difference between measurements that monitor the numbers or volume of autophagic elements (e.g., autophagosomes or autolysosomes) at any stage of the autophagic process versus those that measure flux through the autophagy pathway (i.e., the complete process including the amount and rate of cargo sequestered and degraded). In particular, a block in macroautophagy that results in autophagosome accumulation must be differentiated from stimuli that increase autophagic activity, defined as increased autophagy induction coupled with increased delivery to, and degradation within, lysosomes (in most higher eukaryotes and some protists such as Dictyostelium) or the vacuole (in plants and fungi). In other words, it is especially important that investigators new to the field understand that the appearance of more autophagosomes does not necessarily equate with more autophagy. In fact, in many cases, autophagosomes accumulate because of a block in trafficking to lysosomes without a concomitant change in autophagosome biogenesis, whereas an increase in autolysosomes may reflect a reduction in degradative activity. It is worth emphasizing here that lysosomal digestion is a stage of autophagy, and evaluating its competence is a crucial part of the evaluation of autophagic flux, or complete autophagy. Here, we present a set of guidelines for the selection and interpretation of methods for use by investigators who aim to examine macroautophagy and related processes, as well as for reviewers who need to provide realistic and reasonable critiques of papers that are focused on these processes. These guidelines are not meant to be a formulaic set of rules, because the appropriate assays depend in part on the question being asked and the system being used. In addition, we emphasize that no individual assay is guaranteed to be the most appropriate one in every situation, and we strongly recommend the use of multiple assays to monitor autophagy. Along these lines, because of the potential for pleiotropic effects due to blocking autophagy through genetic manipulation, it is imperative to delete or knock down more than one autophagy-related gene. In addition, some individual Atg proteins, or groups of proteins, are involved in other cellular pathways, so not all Atg proteins can be used as a specific marker for an autophagic process. In these guidelines, we consider these various methods of assessing autophagy and what information can, or cannot, be obtained from them. Finally, by discussing the merits and limits of particular autophagy assays, we hope to encourage technical innovation in the field.
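    The central point about autophagosome counts versus flux can be illustrated with a toy calculation; the rates below are arbitrary numbers chosen only to show that a larger pool of autophagosomes can coincide with less, not more, cargo degradation.

        # Toy sketch: the visible autophagosome pool depends on both formation and
        # clearance, so a bigger count does not imply higher autophagic flux.
        # All rates are arbitrary illustrative values.
        import math

        def autophagosome_count(t_hours, formation_per_h, clearance_per_h):
            """Pool size over time with constant formation and first-order clearance."""
            if clearance_per_h == 0:
                return formation_per_h * t_hours  # degradation blocked: pure accumulation
            return (formation_per_h / clearance_per_h) * (1 - math.exp(-clearance_per_h * t_hours))

        # Same formation rate; only delivery to / degradation in lysosomes differs.
        print(autophagosome_count(4, 10, 2.0))  # ~5 autophagosomes, cargo is being degraded
        print(autophagosome_count(4, 10, 0.0))  # 40 autophagosomes, nothing is degraded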

    An Optimization Model for Technology Adoption of Marginalized Smallholders: Theoretical Support for Matching Technological and Institutional Innovations


    Methodological Review and Revision of the Global Hunger Index


    Harvesting Solar Power in India


    Social Safety Nets for Food and Nutritional Security in India
